An Empirical Evaluation of Knowledge Sources and Learning Algorithms for Word Sense Disambiguation
Abstract
In this paper, we evaluate a variety of knowledge sources and supervised learning algorithms for word sense disambiguation on SENSEVAL-2 and SENSEVAL-1 data. Our knowledge sources include the part-of-speech of neighboring words, single words in the surrounding context, local collocations, and syntactic relations. The learning algorithms evaluated include Support Vector Machines (SVM), Naive Bayes, AdaBoost, and decision tree algorithms. We present empirical results showing the relative contribution of the component knowledge sources and the different learning algorithms. In particular, using all of these knowledge sources and SVM (i.e., a single learning algorithm) achieves accuracy higher than the best official scores on both SENSEVAL-2 and SENSEVAL-1 test data.
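As a rough illustration of three of the knowledge sources named in the abstract, the following sketch turns one tokenized, POS-tagged sentence into binary features for a target word. It is not the authors' implementation: the function name `extract_features`, the feature-name format, the window size, the `<PAD>` token, and the particular collocation spans are all assumptions made for this example. Syntactic relations are left out because they would require a parser.

```python
from typing import Dict, List


def extract_features(tokens: List[str], pos_tags: List[str], target: int,
                     window: int = 3) -> Dict[str, int]:
    """Build binary features for the word at index `target`.

    Hypothetical sketch: names, window size, and spans are illustrative
    assumptions, not values taken from the paper.
    """
    n = len(tokens)
    feats: Dict[str, int] = {}

    # 1. Part-of-speech of neighboring words, keyed by signed offset.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        i = target + offset
        tag = pos_tags[i] if 0 <= i < n else "<PAD>"
        feats[f"pos[{offset:+d}]={tag}"] = 1

    # 2. Single words in the surrounding context (unordered, lowercased).
    for i, tok in enumerate(tokens):
        if i != target:
            feats[f"word={tok.lower()}"] = 1

    # 3. Local collocations: ordered word sequences next to the target.
    #    Each (start, end) pair is an offset span; the target itself is skipped.
    for start, end in [(-1, -1), (1, 1), (-2, -1), (1, 2), (-1, 1)]:
        words = []
        for offset in range(start, end + 1):
            if offset == 0:
                continue
            i = target + offset
            words.append(tokens[i].lower() if 0 <= i < n else "<PAD>")
        feats[f"colloc[{start},{end}]={'_'.join(words)}"] = 1

    return feats


if __name__ == "__main__":
    sentence = ["He", "sat", "on", "the", "bank", "of", "the", "river"]
    tags = ["PRP", "VBD", "IN", "DT", "NN", "IN", "DT", "NN"]
    for name in sorted(extract_features(sentence, tags, target=4)):
        print(name)
```

A feature dictionary of this kind could then be vectorized and passed to any of the supervised learners the abstract lists, such as an SVM or Naive Bayes classifier trained separately for each ambiguous word.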
Similar Resources
Word Sense Disambiguation using Optimised Combinations of Knowledge Sources
Word sense disambiguation algorithms, with few exceptions, have made use of only one lexical knowledge source. We describe a system which performs unrestricted word sense disambiguation (on all content words in free text) by combining different knowledge sources: semantic preferences, dictionary definitions and subject/domain codes along with part-of-speech tags. The usefulness of these sources...
Psycholinguistics, Lexicography, and Word Sense Disambiguation
Mainstream word sense disambiguation systems have relied mostly on supervised approaches. Complex interactions have been observed between learning algorithms and knowledge sources, but the factors underlying such phenomena are underexplored. This calls for more qualitative analysis of disambiguation results, possibly from an inter-disciplinary perspective. The current study thus preliminarily e...
Knowledge sources for disambiguating highly ambiguous verbs in machine translation
Word sense disambiguation (WSD) is one of the most challenging outstanding problems in current machine translation systems. An effective proposal in this context will rely on the use of relevant knowledge sources. Moreover, it must perform better than the current traditional approaches. We present some experiments with machine learning algorithms traditionally applied to WSD, aiming to discove...
Joining Forces Pays Off: Multilingual Joint Word Sense Disambiguation
We present a multilingual joint approach to Word Sense Disambiguation (WSD). Our method exploits BabelNet, a very large multilingual knowledge base, to perform graph-based WSD across different languages, and brings together empirical evidence from these languages using ensemble methods. The results show that, thanks to complementing wide-coverage multilingual lexical knowledge with robust graph-...
Survey of Word Sense Disambiguation Approaches
Word Sense Disambiguation (WSD) is an important but challenging technique in the area of natural language processing (NLP). Hundreds of WSD algorithms and systems are available, but less work has been done in regard to choosing the optimal WSD algorithms. This paper summarizes the various knowledge sources used for WSD and classifies existing WSD algorithms according to their techniques. The ra...
Publication date: 2002